Abstract

We propose two algorithms for solving convex optimization problems with linear 
ascending constraints. When the objective function is separable, we propose a dual 
method which terminates in a finite number of iterations. In particular, the worst-case
complexity of our dual method improves over the best-known result for this problem
in [4]. We then propose a gradient projection method to solve a more general class
of problems. The gradient projection method uses the dual method as a subroutine
in each projection step and does not need to evaluate the inverse gradient functions
as most dual methods do. Numerical experiments show that both of our algorithms
perform well on the test problems.



1 Introduction 

In this paper, we consider the following optimization problem: 

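\begin{align}
\text{(P1)} \quad \text{minimize}_{y} \quad & F(y) = f(y_1, \ldots, y_n) \tag{1} \\
\text{subject to} \quad & \sum_{i=1}^{k} y_i \le \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n-1 \tag{2} \\
& \sum_{i=1}^{n} y_i = (\le) \sum_{i=1}^{n} \alpha_i \tag{3} \\
& 0 \le y_i \le \beta_i, \quad \forall i = 1, \ldots, n, \tag{4}
\end{align}

where $F(\cdot)$ is jointly convex in $y = (y_1, \ldots, y_n)$ and $0 \le \alpha_i < +\infty$, $0 \le \beta_i < +\infty$, for
$i = 1, \ldots, n$. We make the following contributions in this paper.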

1. We develop a dual method to solve a special case of (P1) with separable objective
functions and (3) being an inequality constraint. Our dual method stops in a finite
number of iterations and improves the worst-case complexity over the algorithm
in [2].

2. Using the dual method as a subroutine, we propose a gradient projection method
to solve (P1). Our proposed method takes advantage of the structure of the
constraints so that each projection step can be completed efficiently. The gradient
projection method also allows non-separable objective functions and an equality
constraint in (3).

3. We perform numerical experiments on several test problems. The test results
show that our proposed algorithms outperform the algorithm in [2] as well as
other standard methods in most problems.

1.1 Equivalent Form 

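We first point out an equivalent form of (P1) which is frequently used in the literature:

\begin{align}
\text{(P2)} \quad \text{minimize}_{y} \quad & G(y) = g(y_1, \ldots, y_n) \notag \\
\text{subject to} \quad & \sum_{i=1}^{k} y_i \ge \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n-1 \notag \\
& \sum_{i=1}^{n} y_i = (\ge) \sum_{i=1}^{n} \alpha_i \notag \\
& 0 \le y_i \le \beta_i, \quad \forall i = 1, \ldots, n, \tag{5}
\end{align}

where $G(y)$ is jointly convex in $y$. To translate (5) into (P1), we define $z_i = \beta_i - y_i$
and replace $y_i$ by $z_i$; the optimization problem then becomes:

\begin{align*}
\text{minimize}_{z} \quad & F(z) = G(\beta_1 - z_1, \ldots, \beta_n - z_n) \\
\text{subject to} \quad & \sum_{i=1}^{k} z_i \le \sum_{i=1}^{k} (\beta_i - \alpha_i), \quad \forall k = 1, \ldots, n-1 \\
& \sum_{i=1}^{n} z_i = (\le) \sum_{i=1}^{n} (\beta_i - \alpha_i) \\
& 0 \le z_i \le \beta_i, \quad \forall i = 1, \ldots, n,
\end{align*}

which is exactly of form (P1).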
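The change of variables can be checked numerically. The following is a minimal sketch, assuming NumPy, randomly generated data, and the inequality version of the last constraint; the function names `p2_feasible` and `p1_feasible` are hypothetical and introduced only for illustration. It tests that a point $y$ is feasible for (P2) exactly when $z = \beta - y$ is feasible for the transformed problem.

```python
import numpy as np

def p2_feasible(y, alpha, beta):
    """Inequality version of (P2): partial sums of y dominate those of alpha, 0 <= y <= beta."""
    return bool(np.all(np.cumsum(y) >= np.cumsum(alpha))
                and np.all(y >= 0.0) and np.all(y <= beta))

def p1_feasible(z, alpha, beta):
    """Transformed problem: partial sums of z bounded by those of beta - alpha, 0 <= z <= beta."""
    return bool(np.all(np.cumsum(z) <= np.cumsum(beta - alpha))
                and np.all(z >= 0.0) and np.all(z <= beta))

rng = np.random.default_rng(0)          # illustrative data only
n = 6
alpha = rng.uniform(0.0, 1.0, n)
beta = np.full(n, 2.0)
samples = rng.uniform(-0.5, 2.5, size=(2000, n))

flags_p2 = [p2_feasible(y, alpha, beta) for y in samples]
flags_p1 = [p1_feasible(beta - y, alpha, beta) for y in samples]
n_feasible = sum(flags_p2)
```

Both feasibility flags should agree on every sample, since the map $y \mapsto \beta - y$ reverses each inequality and shifts each right-hand side by the partial sums of $\beta$.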

1.2 Applications 

The formulation (P1) arises in many applications. One example which is a problem of
smoothing is discussed in 0]. Another one that arises in a special case of network flow 
problems is studied in 0] and Both these two examples have the form of (P2) with 
G{y) = T2i ^iVi- Other problems arise frequently in communications and are discussed 
in [2], Uand 0]. Here we present another motivating application of this model in 
operations management. 

Inventory problem with downward substitution. A firm sells a product with
$n$ different grades, with 1 the highest and $n$ the lowest. The firm has $\alpha_i$ grade $i$ products
on hand and faces a random demand $D_i$ for each grade $i$. Any product of grade $i$
can be used to satisfy the demand for a product of grade $i$ or lower ($j \ge i$). Before the
demand is realized, the firm has to make an inventory decision $y_i$ of how much grade $i$
product to put into stock. Once this is done, the products are no longer substitutable
(for example, the firm has to package these products during this process; products of
different grades need different packages and will not be distinguishable after packaging).
For each grade $i$, there is a unit overage cost $o_i$ if $D_i$ is less than $y_i$ and a unit underage
cost $u_i$ if $D_i$ is greater than $y_i$. The objective is to minimize the expected total cost.
The problem can be written as:

\begin{align}
\text{minimize}_{y} \quad & \sum_{i=1}^{n} \left( u_i E(D_i - y_i)^+ + o_i E(y_i - D_i)^+ \right) \notag \\
\text{subject to} \quad & \sum_{i=1}^{k} y_i \le \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n \notag \\
& y_i \ge 0, \quad \forall i = 1, \ldots, n, \tag{6}
\end{align}

which is of form (P1). In practice, one might have a strong incentive to solve (6) faster.
For example, the firm may also need to decide the upfront production quantities $a_i$
for each grade with production cost $c_i a_i$. The actual yield of grade $i$ product is $a_i U_i$,
where the $U_i$'s are random yields with known distributions. In this case, the firm's problem
is the following two-stage stochastic programming problem:

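\begin{align}
\text{minimize}_{a} \quad & \sum_{i=1}^{n} c_i a_i + E_U \left\{ \text{minimize}_{y} \sum_{i=1}^{n} \left( u_i E(D_i - y_i)^+ + o_i E(y_i - D_i)^+ \right) \right\} \notag \\
\text{subject to} \quad & \sum_{i=1}^{k} y_i \le \sum_{i=1}^{k} a_i U_i, \quad \forall k = 1, \ldots, n \notag \\
& y_i \ge 0, \quad \forall i = 1, \ldots, n. \tag{7}
\end{align}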



A natural approach to solve (7) is to use the stochastic gradient method [9]. However,
this requires one to solve the inner problem repeatedly. Therefore, improving the
efficiency of solving (6) could be of strong interest.



1.3 Literature Review 

The main related literature to this paper is [2]. In [2], the authors propose a dual
method for solving (P2) with separable objective functions. We call the algorithm
in [2] the "P-S algorithm" in the rest of the discussion. The P-S algorithm finishes
in $O(n)$ outer iterations. In each iteration, it solves up to $n$ nonlinear equations
and sets at least one primal variable based on the solutions to the equations. The
efficiency of the P-S algorithm depends on how fast one can solve those equations.
When the equations have closed-form solutions, the P-S algorithm performs very well;
otherwise, it may not. In this paper, we propose a dual algorithm which does not
attempt to set primal variables in each iteration. Instead, we set one dual variable in
each iteration and maintain the optimality conditions for the variables that have been
set. Our dual algorithm also finishes in $O(n)$ outer iterations, and in each iteration we
solve no more than one equation. We show that the equations we solve are simply the
equations in the P-S algorithm with a lower bound on each term. When the equations in
the P-S algorithm do not have a closed-form solution, solving either kind of equation
usually has the same complexity. In those cases, our dual algorithm reduces the
computational complexity of the P-S algorithm by an order of $n$.

In addition to the dual method, we propose a gradient projection method to solve
the more general problem (P1) allowing non-separable objective functions. Gradient
projection methods are widely used to solve a variety of convex optimization problems.
We refer the readers to [ij] for a thorough discussion of this method. In particular, the 
key element in a gradient projection method is the design of the projection step. To the
best of my knowledge, this paper is the first to study the projection step under linear
ascending constraints. Therefore, the result may be of independent interest from this
perspective.

Another popular method for solving nonlinear convex optimization problems is the interior
point method. However, we focus on first-order methods in this paper because of
their low memory requirement and thus their ability to solve large problems. Performance
comparisons between our proposed algorithms and the interior point algorithm
(implemented by CVX) are shown in the numerical tests, and the results indicate that our
algorithms are usually much more efficient.

We note that there is abundant literature on solving a special case of (P1) when
there is only one equality/inequality constraint (usually called the simplex constraint or
$\ell_1$ constraint). We refer the readers to Ul] for a survey on this problem. Although the
dual method is widely used in those studies, the details of our algorithm differ because
of the special structure of this problem. As we show later, the specific structure of the
constraints is the key to making our algorithm work.



1.4 Structure of the paper 

In Section 2, we develop a dual method to solve a special case of (P1) with separable
objective functions and (3) being an inequality constraint. In Section 3, we further
propose a gradient projection method to solve the general problem (P1). Numerical
tests are shown in Section 4 to examine the performance of our algorithms. Section 5
concludes this paper.



2 A Dual Method 

In this section, we study a special case of (P1) in which the objective function $F$
is separable, i.e., $F(y) = \sum_{i=1}^{n} f_i(y_i)$, and (3) is an inequality constraint. There are
two reasons why we consider separable objectives. First, in most of the applications
mentioned in Section 1.2, the objective functions are indeed separable. Second, the
study of separable objective functions lays the foundation for the analysis of the
gradient projection method in Section 3, which can solve more general problems.
In the following, we develop a dual method to solve the following problem:

\begin{align}
\text{(P3)} \quad \text{minimize}_{y} \quad & \sum_{i=1}^{n} f_i(y_i) \tag{8} \\
\text{subject to} \quad & \sum_{i=1}^{k} y_i \le \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n \tag{9} \\
& 0 \le y_i \le \beta_i, \quad \forall i = 1, \ldots, n. \tag{10}
\end{align}

Here we assume that the $f_i(\cdot)$'s are continuously differentiable and define $g_i(x) = f_i'(x)$.¹
Without loss of generality, we assume that $\beta_i \le \sum_{k=1}^{i} \alpha_k$. Furthermore, we assume
that $g_i(\cdot)$ is strictly increasing with $g_i(0) = l_i$ and $g_i(\beta_i) = h_i$. We define
$\bar{y}_i = \arg\min_{0 \le y \le \beta_i} f_i(y)$, that is, the $\bar{y}_i$'s are the optimal solution to (P3) without
constraint (9). Under the above assumptions, it is easy to see that $\bar{y}_i$ exists and is unique.

We first construct the KKT conditions of (8)-(10). We associate a dual variable $\lambda_k$
to each constraint (9), a dual variable $\delta_i$ to each upper bound constraint, and a dual
variable $\eta_i$ to each nonnegativity constraint in (10). The Lagrangian of (8)-(10) can then
be written as

\begin{equation*}
\sum_{i=1}^{n} f_i(y_i) + \sum_{k=1}^{n} \lambda_k \left( \sum_{i=1}^{k} y_i - \sum_{i=1}^{k} \alpha_i \right) + \sum_{i=1}^{n} \delta_i (y_i - \beta_i) - \sum_{i=1}^{n} \eta_i y_i .
\end{equation*}

¹ Our algorithm works in a similar manner even if $f$ is not differentiable but convex. The discussion will
involve subgradients of $f$ in that case. We make this assumption simply for the convenience of discussion.
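And the KKT conditions are

\begin{align}
& g_i(y_i) = -\sum_{k=i}^{n} \lambda_k + \eta_i - \delta_i, \quad \forall i = 1, \ldots, n, \tag{11} \\
& y_i \cdot \eta_i = 0, \; y_i \ge 0, \; \eta_i \ge 0, \quad \forall i = 1, \ldots, n, \tag{12} \\
& (\beta_i - y_i) \cdot \delta_i = 0, \; y_i \le \beta_i, \; \delta_i \ge 0, \quad \forall i = 1, \ldots, n, \tag{13} \\
& \sum_{i=1}^{k} y_i \le \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n, \tag{14} \\
& \lambda_k \cdot \left( \sum_{i=1}^{k} y_i - \sum_{i=1}^{k} \alpha_i \right) = 0, \; \lambda_k \ge 0, \quad \forall k = 1, \ldots, n. \tag{15}
\end{align}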
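As a brief check, the stationarity condition (11) can be recovered by differentiating the Lagrangian with respect to $y_i$. The sketch below assumes the sign convention in which $\lambda_k$, $\delta_i$ and $\eta_i$ multiply the constraints written as $\sum_{i \le k} y_i - \sum_{i \le k} \alpha_i \le 0$, $y_i - \beta_i \le 0$ and $-y_i \le 0$; the symbol $\mathcal{L}$ is introduced here only for illustration.

```latex
\begin{align*}
\mathcal{L} &= \sum_{i=1}^{n} f_i(y_i)
  + \sum_{k=1}^{n} \lambda_k \Bigl( \sum_{i=1}^{k} y_i - \sum_{i=1}^{k} \alpha_i \Bigr)
  + \sum_{i=1}^{n} \delta_i (y_i - \beta_i) - \sum_{i=1}^{n} \eta_i y_i, \\
\frac{\partial \mathcal{L}}{\partial y_i}
  &= g_i(y_i) + \sum_{k=i}^{n} \lambda_k + \delta_i - \eta_i = 0
  \quad \Longleftrightarrow \quad
  g_i(y_i) = -\sum_{k=i}^{n} \lambda_k + \eta_i - \delta_i .
\end{align*}
```

The inner sum starts at $k = i$ because $y_i$ appears in the $k$-th ascending constraint exactly when $k \ge i$.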



Define $\phi_i(x) = \max\{l_i, \min\{x, h_i\}\}$ and $H_i(x) = g_i^{-1}(\phi_i(x))$. By the assumptions on
$g_i(\cdot)$, $l_i$ and $h_i$, we have $0 \le H_i(x) \le \beta_i$. Conditions (11)-(13) can be equivalently
written as

\begin{equation}
y_i = H_i\left( -\sum_{k=i}^{n} \lambda_k \right) \tag{16}
\end{equation}

with

\begin{equation*}
\eta_i = \left( g_i\left( H_i\left( -\sum_{k=i}^{n} \lambda_k \right) \right) + \sum_{k=i}^{n} \lambda_k \right)^+, \qquad
\delta_i = \left( -g_i\left( H_i\left( -\sum_{k=i}^{n} \lambda_k \right) \right) - \sum_{k=i}^{n} \lambda_k \right)^+,
\end{equation*}

where $x^+ = \max\{x, 0\}$. Since (P3) is linearly constrained and convex, solving the
KKT conditions is equivalent to solving the original problem [l^ . In the following,
we propose an efficient dual method to solve the KKT conditions. The idea of this dual
method is to assign values to the dual variables $\lambda$'s such that the optimality conditions
(14)-(16) hold. We state our algorithm as follows:



Algorithm 1 

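Step 0: Initialization. Let $d_k = \sum_{i=1}^{k} \bar{y}_i - \sum_{i=1}^{k} \alpha_i$, $k = 1, 2, \ldots, n$. Define

\begin{align*}
w_0 &= 0, \\
w_1 &= \min\{k : d_k > 0\}, \\
w_{j+1} &= \min\{k > w_j : d_k > d_{w_j}\}.
\end{align*}

Here we define $\min \emptyset = \infty$. If $w_1 = \infty$, then setting $y_i = \bar{y}_i$ and $\lambda_i = 0$ for all $i$ will
satisfy the KKT conditions and thus is optimal. Otherwise, let $L = \max\{j \ge 1 : w_j < \infty\}$.
Define $S = \{w_1, w_2, \ldots, w_L\}$. Let $\lambda_i = 0$, $\eta_i = 0$ for all $i$ and let $j = L$.

Step 1: Main Loop (Outer Loop).
WHILE $j \ge 1$

• Case 1: If

\begin{equation}
\sum_{s=w_{j-1}+1}^{w_j} \left( H_s\left( -\sum_{t=j+1}^{L} \lambda_{w_t} \right) - \alpha_s \right) \ge 0 \tag{17}
\end{equation}

then choose $\xi \ge 0$ such that

\begin{equation}
\sum_{s=w_{j-1}+1}^{w_j} \left( H_s\left( -\sum_{t=j+1}^{L} \lambda_{w_t} - \xi \right) - \alpha_s \right) = 0 \tag{18}
\end{equation}

and set $\lambda_{w_j} = \xi$, $j = j - 1$.

• Case 2: If

\begin{equation}
\sum_{s=w_{j-1}+1}^{w_j} \left( H_s\left( -\sum_{t=j+1}^{L} \lambda_{w_t} \right) - \alpha_s \right) < 0 \tag{19}
\end{equation}

then use binary search to find

\begin{equation}
r^* = \min\left\{ j + 1 \le r \le L : \sum_{s=w_{j-1}+1}^{w_r} \left( H_s\left( -\sum_{t=r+1}^{L} \lambda_{w_t} \right) - \alpha_s \right) \ge 0 \right\}. \tag{20}
\end{equation}

If such $r^*$ does not exist, then set all $\lambda_{w_r} = 0$ for $r = j + 1, \ldots, L$ and $j = j - 1$.
Otherwise, choose $\xi \ge 0$ such that

\begin{equation}
\sum_{s=w_{j-1}+1}^{w_{r^*}} \left( H_s\left( -\sum_{t=r^*+1}^{L} \lambda_{w_t} - \xi \right) - \alpha_s \right) = 0 \tag{21}
\end{equation}

and set $\lambda_{w_{r^*}} = \xi$ and $\lambda_{w_r} = 0$ for $r < r^*$. Set $j = j - 1$.
END WHILE

Step 2: Output. Set

\begin{align*}
y_i &= H_i\left( -\sum_{k=i}^{n} \lambda_k \right), \\
\eta_i &= \left( g_i\left( H_i\left( -\sum_{k=i}^{n} \lambda_k \right) \right) + \sum_{k=i}^{n} \lambda_k \right)^+, \qquad
\delta_i = \left( -g_i\left( H_i\left( -\sum_{k=i}^{n} \lambda_k \right) \right) - \sum_{k=i}^{n} \lambda_k \right)^+ .
\end{align*}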
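Step 0 of Algorithm 1 amounts to scanning the partial sums $d_k$ for successive new maxima. The following is a minimal Python sketch of that scan, assuming strict inequalities in the definition of the $w_j$'s and 1-based indices as in the text; the function name `step0` is hypothetical.

```python
def step0(y_bar, alpha):
    """Return (d, w): partial sums d_k and the record indices w = [w_1, ..., w_L] (1-based)."""
    d, total = [], 0.0
    for yb, a in zip(y_bar, alpha):
        total += yb - a               # d_k = sum_{i<=k} (y_bar_i - alpha_i)
        d.append(total)
    w, record = [], 0.0               # w_0 = 0, so the first record must exceed 0
    for k, dk in enumerate(d, start=1):
        if dk > record:               # d_k strictly exceeds d_{w_j}
            w.append(k)
            record = dk
    return d, w

# Small illustrative instance (made-up numbers).
d_example, w_example = step0([1, 3, 1, 4], [2, 1, 2, 1])
```

An empty list for `w` corresponds to the case $w_1 = \infty$, in which the unconstrained minimizers $\bar{y}_i$ are already feasible.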



First, we argue that the $\xi$'s defined in (18) and (21) exist. This can be verified by
observing that when $\xi = 0$, the left-hand sides of (18) and (21) are both nonnegative,
and as $\xi \to \infty$, both of them will be less than zero. Also, by our assumption, the left-hand
sides of (18) and (21) are both continuous. Therefore, by the intermediate value
theorem, such $\xi$'s must exist. We now state the main result of this section.

Theorem 1 Algorithm 1 terminates within $L \le n$ outer iterations and the output
minimizes (8)-(10).
In the following, we prove Theorem 1. Let $\{y_i^*\}_{i=1}^{n}$, $\{\lambda_i^*\}_{i=1}^{n}$, $\{\eta_i^*\}_{i=1}^{n}$, $\{\delta_i^*\}_{i=1}^{n}$ be
any solution to the KKT conditions (14)-(16) and thus an optimal solution to the
original problem. First, it is easy to see that $y_i^* \le \bar{y}_i$ for all $i$; otherwise replacing $y_i^*$ with
$\bar{y}_i$ will strictly improve the objective value while still satisfying the constraints. Next
we claim that for any $k \notin S$, we must have $\lambda_k^* = 0$. This is because for $w_{l-1} < k < w_l$,
we have

\begin{equation*}
\sum_{i=1}^{k} (y_i^* - \alpha_i) \le \sum_{i=w_{l-1}+1}^{k} (y_i^* - \alpha_i) \le \sum_{i=w_{l-1}+1}^{k} (\bar{y}_i - \alpha_i) < 0,
\end{equation*}

where the last inequality is because of the definition of $w_l$. By the complementarity
condition (15), $\lambda_k^* = 0$ for $w_{l-1} < k < w_l$.

Note that in the KKT conditions, given the $\lambda_k$'s, the $y$'s, $\eta$'s and $\delta$'s are uniquely
determined, and that changing $\lambda_k$ only affects the $y_i$'s, $\eta_i$'s and $\delta_i$'s for $i \le k$. In each
iteration of Algorithm 1, we assign one new $\lambda_{w_j}$ and may modify all $\lambda_{w_k}$'s for $k \ge j + 1$.
We now state the following property of Algorithm 1, which immediately implies
Theorem 1.

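Proposition 1 When Algorithm 1 finishes loop $j$ ($j = L, L - 1, \ldots, 1$), the current $\lambda_i$'s
together with

\begin{equation}
y_i = H_i\left( -\sum_{k=i}^{n} \lambda_k \right), \tag{22}
\end{equation}

\begin{equation}
\eta_i = \left( \phi_i\left( -\sum_{k=i}^{n} \lambda_k \right) + \sum_{k=i}^{n} \lambda_k \right)^+ \quad \text{and} \quad
\delta_i = \left( -\phi_i\left( -\sum_{k=i}^{n} \lambda_k \right) - \sum_{k=i}^{n} \lambda_k \right)^+ \tag{23}
\end{equation}

satisfy the following conditions:

\begin{align}
& y_i \cdot \eta_i = 0, \; \eta_i \ge 0, \; y_i \ge 0, \quad \forall i \ge w_{j-1} + 1, \tag{24} \\
& (\beta_i - y_i) \cdot \delta_i = 0, \; y_i \le \beta_i, \; \delta_i \ge 0, \quad \forall i \ge w_{j-1} + 1, \tag{25} \\
& \sum_{s=w_{j-1}+1}^{k} y_s \le \sum_{s=w_{j-1}+1}^{k} \alpha_s, \quad \forall k \ge w_{j-1} + 1, \tag{26} \\
& \lambda_k \cdot \left( \sum_{s=w_{j-1}+1}^{k} y_s - \sum_{s=w_{j-1}+1}^{k} \alpha_s \right) = 0, \; \lambda_k \ge 0, \quad \forall k \ge w_{j-1} + 1. \tag{27}
\end{align}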

Before we prove Proposition 1, we introduce a lemma that will be used repeatedly
in the proof.

Lemma 1 If $y_i$ is defined in (22) for some nonnegative $\lambda_k$'s, then $y_i \le \bar{y}_i$.

The lemma follows immediately from the assumption that the $g_i(\cdot)$'s are strictly increasing
and that $\bar{y}_i = H_i(0)$.



Proof of Proposition 1: First, note that conditions (24) and (25) are satisfied for
all $j$'s because of the definitions in (22) and (23). Therefore, it suffices to show that
conditions (26) and (27) hold for $j = L, L - 1, \ldots, 1$. We use backward induction for
this. First we show that for $j = L$, (26) and (27) hold for all $k \ge w_{L-1} + 1$.

First we show that (26) holds. When Algorithm 1 finishes loop $L$, for any $w_{L-1} + 1 \le s' < w_L$,
we have

\begin{equation}
\sum_{s=w_{L-1}+1}^{s'} (y_s - \alpha_s) \le \sum_{s=w_{L-1}+1}^{s'} (\bar{y}_s - \alpha_s) \le 0, \tag{28}
\end{equation}

where the first inequality is due to Lemma 1 and the second inequality is due to the
definition of $w_L$. On the other hand, for $s' \ge w_L$, we have

\begin{equation*}
\sum_{s=w_{L-1}+1}^{s'} (y_s - \alpha_s) \le \sum_{s=w_{L-1}+1}^{w_L} (y_s - \alpha_s) + \sum_{s=w_L+1}^{s'} (\bar{y}_s - \alpha_s) \le 0,
\end{equation*}

where the first inequality is because of Lemma 1 and the second one is because of the
definition of $w_L$. Therefore (26) holds when $j = L$.

To show that (27) holds for $j = L$, note that among all the $\lambda_k$'s with $k \ge w_{L-1} + 1$,
the only possible non-zero one is $\lambda_{w_L}$. If Case 1 of the algorithm happens in this loop,
then

\begin{equation*}
\sum_{s=w_{L-1}+1}^{w_L} (y_s - \alpha_s) = 0.
\end{equation*}

Otherwise, $\lambda_{w_L} = 0$. Therefore, (27) holds for $j = L$.

Now we assume that (26)-(27) hold after the algorithm completes outer loop
$j + 1$, and consider the situation when it finishes outer loop $j$.
We consider two cases:

• Case 1: (|17p holds in the current loop (j = j). In this case, we have 

w-^ WJ 

And the y^'s for s > Wj does not change from the previous loop. Therefore, for 
any k = Wj (j > j), we have 

k k 

Y y^^ J2 

S=-!iIJ_i+l S=WJ_i+l 

And for wj < k < wj+i {j > j — 1), 

k Wj k Wj 

Y - as) < Y ^y^" ~ + Y - «s) < Y ^y^ ~ - 

S=lOJ_x+l S=W-j_]^+l S=Wj+l 
where the first inequality is because of Lemma 1 and the second inequality is
because of the definition of the $w_j$'s. Therefore, (26) holds for all $k \ge w_{j-1} + 1$. For
(27), we only need to study $k = w_j$ since all other $\lambda_k$'s are 0. And it holds because
of (29) and the induction assumption. Therefore, (26)-(27) hold for $j$ in this
case.



Case 2: ()19p holds in the current loop. Then there are two further cases: 

— a): r* does not exist. In this case, by the definition of Algorithm 1, all 
Afc's are zero after this iteration and ^4.1 Us < Z]s=«)- 1+1 '^s ^'^'^ 
all k = Wj (j > j). By the same argument as in case 1, we know that 
Es=n,j_,+i Vs < Es=u,j_,+i »s for all k > wj_, + 1. Therefore, m - m 
hold for j in this case. 

— b): r* exists. Denote the A's and y's after the previous loop by A and y. It 
is easy to see that in this case, A, < Aj and yi > jji for all i. We first show 
the following lemma whose proof is referred to Appendix lAl 

Lemma 2 Au,^, > 0. 

With Lemma H we show that ([26]) - ([27]) hold. We first consider By 
([2T]) . we have 

iVs - as) = 0. (30) 



Therefore for k > Wr*, we know that 



k k 



^ {Vs - as) = ^ ivs - as) = ^ {ys - as) = ^ {ys - as) < 0, 

S=Wj_i+l S=Wj.*+l S=W^*+l S=Wj + l 

where the second equality is because that ys does not change for s > Wr* + 
1. The last equality is because of the induction assumption that A^^, • 
Yl7=w-+i^y^ ~ ^s) = and Lemma [21 Therefore, for all wj < k < Wj+i, 
j — 1 < j < r*, we have 

k Wj Wj.* Wj 

- as) < Y -as) <- Y -as)= ^ {ys - as) < 0, 

S=UI^_j^+l S=Wj_-i+l s=Wj + l s=Wj + l 

where the first inequality is because of the definition of wj , the second equal- 
ity is because of (pOj) and y^ > jji and the last equality is because of the 
induction assumption and Lemma [2l Therefore, (|26p holds in this case. 
Lastly, we show that ()27p also holds. It suffices to show that for each r > r* 
such that Ar- > 0, 

Y iVs - as) = 0. 
This is equivalent as showing that for each r > r* such that A^ > 0, 

s=w*+l 
Note that




ivs -as)= - as) = - - Y ~ 




By induction assumption and Lemma [21 




Y - as) = Y - as) = 0. 



S=WJ + 1 S=WJ + 1 



Therefore (27) holds in this case and Proposition 1 is proved. □
Now we make some comments on Algorithm 1. 

By its definition, Algorithm 1 terminates within $L \le n$ outer iterations. In practical
problems, $L$ might be much less than $n$. In those cases, the algorithm can output the
solution very fast. This property is shared by the P-S algorithm (recall that we use
"P-S algorithm" to refer to the algorithm proposed in [2]). Now we use $I$ to denote the
complexity (number of arithmetic operations) of solving (18) or (21) once (it is easy to
see that $I \ge n$). In each iteration of Algorithm 1, if Case 1 happens, the algorithm has
to perform a sum of no more than $n$ terms and solve (18) once. Therefore,
there are $O(I)$ arithmetic operations in this case. If Case 2 happens, then the algorithm
has similar tasks as in Case 1, and in addition it needs to find $r^*$ defined in (20),
which takes no more than $O(n \log n)$ operations. Therefore, the complexity in Case
2 is $O(\max\{n \log n, I\})$. Combined with $O(n)$ outer iterations, the total arithmetic
complexity of our algorithm is $O(\max\{n^2 \log n, nI\})$.
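When no closed form is available, one way to carry out a single solve of an equation of the form $\sum_s \left( H_s(-c - \xi) - \alpha_s \right) = 0$ is bisection on $\xi$, since the left-hand side is nonincreasing in $\xi$. The sketch below is an illustration under stated assumptions, not the implementation used in the experiments: the helper names `make_H` and `solve_xi`, the tolerance, and the quadratic test instance $f_i(y) = \frac{1}{2}(y - c_i)^2$ are all hypothetical.

```python
def make_H(g_inv, lower, upper):
    """H(x) = g^{-1}(max(lower, min(x, upper))), with lower = g(0) and upper = g(beta)."""
    def H(x):
        return g_inv(max(lower, min(x, upper)))
    return H

def solve_xi(H_list, alpha, shift=0.0, tol=1e-10):
    """Find xi >= 0 with sum_s (H_s(-shift - xi) - alpha_s) = 0 by bisection."""
    total_alpha = sum(alpha)

    def residual(xi):
        return sum(H(-shift - xi) for H in H_list) - total_alpha

    if residual(0.0) < 0.0:
        raise ValueError("residual must be nonnegative at xi = 0 (Case 1)")
    hi = 1.0
    while residual(hi) > 0.0:          # expand until the sign changes
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:               # each pass costs one O(n) evaluation
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative instance: g_i(y) = y - c_i, so g_i^{-1}(u) = u + c_i.
c, beta, alpha = [3.0, 2.0], [4.0, 4.0], [1.0, 1.0]
H_list = [make_H(lambda u, ci=ci: u + ci, -ci, bi - ci) for ci, bi in zip(c, beta)]
xi = solve_xi(H_list, alpha)
```

In this instance the equation reads $(3 - \xi) + (2 - \xi) = 2$, so the bisection should converge to $\xi = 1.5$. The number of residual evaluations, each costing $O(n)$, is what the quantity $I$ in the text accounts for.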

Now we compare the complexity result to that of the P-S algorithm. The difference
between the two algorithms is the way the variables are assigned. In each iteration of
the P-S algorithm, it solves

                                                                              (31)

for all $l < j$, where $s$ is the set of unassigned variables. Then the largest solution
is chosen and the corresponding primal variable is set accordingly. Such a method
avoids the need to re-check the validity of the KKT conditions met in previous
steps, as we have to do in Step 2 in Algorithm 1, however at the cost of having to solve
$O(n)$ equations at each step rather than only one as in Algorithm 1. Indeed, the
equations (31) are sometimes easier to solve since they do not involve the lower bound
as those in Algorithm 1 do. If one denotes the arithmetic complexity of solving the equations in
(31) by $O(I')$, then the total arithmetic complexity of the P-S algorithm is $O(n^2 I')$.
Therefore, our algorithm works better than the P-S algorithm when solving the equations
in (31) has similar complexity to solving the equations in (18) and (21), but may work
relatively worse if (31) can be solved explicitly (see [2] for several examples). This
tradeoff is demonstrated in the numerical experiments in Section 4.

There are two main drawbacks of Algorithm 1. First, it can only handle separable
objective functions and an inequality constraint in (3). Second, it involves many
evaluations of $g_i^{-1}$ and also has to solve the equations (18) and (21). These evaluations might
be very expensive computationally if $g_i^{-1}$ does not have a simple form. This is the same
problem as in the P-S algorithm. In particular, [2] shows that the performance of the
P-S algorithm may not be very good if closed-form solutions to the equations (31) do not
exist. To overcome this problem, we propose a gradient projection method in the next
section. The gradient projection method uses Algorithm 1 as a subroutine; however,
in each iteration, $g(\cdot)$ is simply a linear function. Moreover, the gradient projection
method can handle non-separable objective functions as well as equality constraints
in (3). The tradeoff, however, is that the gradient projection method does not give
an exact solution in a finite number of iterations. However, as we demonstrate in our
numerical experiments, it performs quite well in test problems.

3 Gradient Projection Method 

In this section, we propose a gradient projection method to solve (P1). First we
claim that we can assume that constraint (3) is of the inequality form. To transform
a problem with an equality constraint in (3) to an inequality one, we first note that we
can without loss of generality assume $\beta_n = \infty$. This is because one can always add
a penalty term $M (y_n - \beta_n)^+$ with sufficiently large $M$ so that the optimal solution
must satisfy $y_n \le \beta_n$ (if the problem is feasible). Then, we can simply substitute
$y_n = \sum_{i=1}^{n} \alpha_i - \sum_{i=1}^{n-1} y_i$ into (1). Therefore, it is sufficient to consider the following
equivalent problem:

\begin{align}
\text{(P4)} \quad \text{minimize}_{y} \quad & F(y) = f(y_1, \ldots, y_n) \notag \\
\text{subject to} \quad & \sum_{i=1}^{k} y_i \le \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n \notag \\
& 0 \le y_i \le \beta_i, \quad \forall i = 1, \ldots, n. \tag{32}
\end{align}

In the following, we propose a gradient projection method to solve (P4). Gradient
projection methods are used to solve a variety of convex optimization problems. They
minimize a function $F(x)$ subject to convex constraints by generating the sequence
$\{x^{(k)}\}$ via

where $\nabla^{(k)}$ is the (sub)gradient of $F(x)$ at $x^{(k)}$, $\Pi^{(k)}(x) = \arg\min\{\|x - y\| : y \in \mathcal{F}^{(k)}\}$
is the Euclidean projection of $x$ onto the feasible direction space $\mathcal{F}^{(k)}$ at $x^{(k)}$, and $\eta_k$
is a chosen step size. In order for such methods to work efficiently, one needs to find
an efficient way to calculate $\Pi^{(k)}(x)$ for given $x$. Our discussion will mainly focus on
the projection step; the discussion of the outer iterations of the gradient projection
method is referred to 

Assume that for a feasible solution $y$, the binding constraint set of (32) is $S(y)$,
that is, $\sum_{i=1}^{k} y_i = \sum_{i=1}^{k} \alpha_i$ for $k \in S(y)$; the active nonnegativity constraint set of (32)
is $T(y)$, that is, $y_k = 0$ for $k \in T(y)$; and the active upper bound constraint set of (32)
is $R(y)$, that is, $y_k = \beta_k$ for $k \in R(y)$. Then the Euclidean projection of a descent
direction $v$ of $F$ at $y$ onto the feasible direction space can be computed by:

\begin{align}
\text{minimize}_{x} \quad & \|x - v\|_2^2 \notag \\
\text{subject to} \quad & \sum_{i=1}^{k} x_i \le 0, \quad \forall k \in S(y) \notag \\
& x_i \ge 0, \quad \forall i \in T(y) \notag \\
& x_i \le 0, \quad \forall i \in R(y). \tag{33}
\end{align}

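Now we show how (33) can be transformed into a problem of form (P3). Suppose $x^*$
is the optimal solution to (33). It is easy to see that $\|x^* - v\|_2 \le \|v\|_2$ since $x = 0$ is
a feasible solution. Therefore, $x_i^* \ge v_i - \|v\|_2$ for all $i$. Define

\begin{equation*}
z_i = \begin{cases} x_i - v_i + \|v\|_2, & i \notin T(y) \\ x_i, & i \in T(y) \end{cases}
\end{equation*}

and

\begin{equation*}
\bar{v}_i = \begin{cases} \|v\|_2, & i \notin T(y) \\ v_i, & i \in T(y). \end{cases}
\end{equation*}

Then we can rewrite (33) as

\begin{align}
\text{minimize}_{z} \quad & \sum_{i=1}^{n} (z_i - \bar{v}_i)^2 \notag \\
\text{subject to} \quad & \sum_{i=1}^{k} z_i \le \gamma_k, \quad \forall k = 1, \ldots, n \notag \\
& 0 \le z_i \le \nu_i, \quad \forall i = 1, \ldots, n, \tag{34}
\end{align}

where

\begin{equation*}
\gamma_k = \min\left\{ n \|v\|_2 + \|v\|_1, \; \min_{k' \ge k, \, k' \in S(y)} \sum_{i=1, \, i \notin T(y)}^{k'} (\|v\|_2 - v_i) \right\}
\end{equation*}

and

\begin{equation*}
\nu_i = \begin{cases} \|v\|_2 - v_i, & i \in R(y) \\ +\infty, & i \notin R(y). \end{cases}
\end{equation*}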

Note that (34) is of form (P3) and thus can be solved by Algorithm 1. One main
advantage of (34) is that the objective function is quadratic. Therefore, in Algorithm
1, $g_i(z_i) = 2(z_i - \bar{v}_i)$, $l_i = -2\bar{v}_i$, $h_i = 2(\nu_i - \bar{v}_i)$ and $g_i^{-1}(u_i) = \bar{v}_i + u_i/2$. Therefore the
equation (18), and similarly (21), can be written as

\begin{equation}
\phi(\xi) = \sum_{s=w_{j-1}+1}^{w_j} \left( \frac{\max\left(0, \min\left(2\nu_s, \; 2\bar{v}_s - \sum_{t=j+1}^{L} \lambda_{w_t} - \xi \right)\right)}{2} - \alpha_s \right) = 0, \tag{35}
\end{equation}

where $\alpha_s = \gamma_s - \gamma_{s-1}$ (with $\gamma_0 = 0$) in the notation of (P3).

Note that (35) is a decreasing piecewise linear function with no more than $2n$ breakpoints,
and those breakpoints can be computed explicitly. Therefore, to solve (35), one
can first use binary search to find out which piece of the function the solution belongs
to and then simply solve a linear equation. Therefore, the total complexity of solving
(35) is $O(n \log n)$ and the total complexity of each projection step is $O(n^2 \log n)$,
regardless of the form of the objective function.
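The breakpoint-based solve described above can be sketched in a few lines. The example below is a simplified stand-in for (35), assuming the equation has been reduced to $\sum_s \mathrm{clip}(b_s - \xi/2, 0, u_s) = \tau$ with $0 < \tau \le \phi(0)$ and $u_s \ge 0$; the names `phi`, `solve_piecewise_linear`, `b`, `ub` and `target` are hypothetical and chosen only for illustration.

```python
def phi(xi, b, ub):
    """Nonincreasing piecewise-linear function sum_s clip(b_s - xi/2, 0, ub_s)."""
    return sum(min(max(bs - xi / 2.0, 0.0), us) for bs, us in zip(b, ub))

def solve_piecewise_linear(b, ub, target):
    """Return xi >= 0 with phi(xi) = target, assuming 0 < target <= phi(0)."""
    if not 0.0 < target <= phi(0.0, b, ub):
        raise ValueError("need 0 < target <= phi(0)")
    # Breakpoints on xi >= 0: a term leaves its upper bound at 2*(b_s - ub_s)
    # and reaches zero at 2*b_s.
    pts = sorted({0.0}
                 | {2.0 * bs for bs in b if bs > 0.0}
                 | {2.0 * (bs - us) for bs, us in zip(b, ub) if bs - us > 0.0})
    lo, hi = 0, len(pts) - 1           # phi(pts[lo]) >= target > phi(pts[hi])
    while hi - lo > 1:                 # binary search over the sorted breakpoints
        mid = (lo + hi) // 2
        if phi(pts[mid], b, ub) >= target:
            lo = mid
        else:
            hi = mid
    x0, x1 = pts[lo], pts[hi]
    f0, f1 = phi(x0, b, ub), phi(x1, b, ub)
    # phi is linear on [x0, x1], so the root follows from one linear equation.
    return x0 + (f0 - target) * (x1 - x0) / (f0 - f1)

root = solve_piecewise_linear([3.0, 2.0], [4.0, 1.0], 2.0)
root_unbounded = solve_piecewise_linear([1.0], [float("inf")], 0.5)
```

Sorting the breakpoints costs $O(n \log n)$, and the binary search needs $O(\log n)$ evaluations of $O(n)$ each, which is consistent with the $O(n \log n)$ bound stated in the text.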
 #    Problem    n       DM        GP        CVX       P-S
 1    (TP-1)     50      0.050     0.214     0.378     0.154
 2    (TP-1)     150     0.370     0.681     1.792     3.751
 3    (TP-1)     500     5.614     2.559     15.76     219.8
 4    (TP-1)     2000    242.6     34.58     4953.1    N/A
 5    (TP-2)     50      0.018     0.093     0.264     < 0.001
 6    (TP-2)     150     0.136     0.176     0.547     < 0.001
 7    (TP-2)     500     1.911     0.806     14.04     0.0013
 8    (TP-2)     2000    Oo. io    o. iou    4798.9    U.UUZD
 9    (TP-3)     50      0.011     0.088     1.001     0.227
 10   (TP-3)     150     0.015     0.124     4.950     0.966
 11   (TP-3)     500     0.210     0.374     49.44     6.210
 12   (TP-3)     2000    0.499     1.526     2350.2    91.66

Table 1: Performance Comparisons. DM is the dual method developed in Section 2, GP is
the gradient projection method developed in Section 3 and P-S is the algorithm in [2]. N/A
means this method cannot return the optimal solution in the corresponding case.



4 Numerical Experiments 

In this section, we perform numerical tests to examine the performance of both our
dual method and the gradient projection method and compare them to 1) the P-S
algorithm in [2] and 2) CVX [ij. The P-S algorithm also uses a dual method and the
comparison between it and Algorithm 1 is discussed in Section 2. CVX is a popular
convex optimization solver which uses a core solver, SDPT3 or SeDuMi, to solve a large
class of convex optimization problems. It is based on interior point methods. In the
following, we consider three sets of problems. For each one, we test 30 random instances
with input specified in the following (for problems with size $n = 2000$, we only test 3
instances). Note that the default precision of CVX is $\epsilon = 1.5 \times 10^{-8}$. In our dual
method, we solve each equation with precision $\epsilon$, and for the gradient method, our
stopping criterion is that the objective is within $\epsilon$ of the CVX optimal value. All the
computations are run on a PC with a 1.80GHz CPU and the Windows 7 operating system.
We use CVX Version 1.22 and MATLAB version R2010b. The test results are shown
in Table 1.

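The first problem is

\begin{align}
\text{(TP-1)} \quad \text{minimize}_{y} \quad & \sum_{i=1}^{n} \left( \tfrac{1}{4} y_i^4 + v_i y_i \right) \notag \\
\text{subject to} \quad & \sum_{i=1}^{k} y_i \ge \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n-1 \notag \\
& \sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \alpha_i \notag \\
& y_i \ge 0, \quad \forall i = 1, \ldots, n. \tag{36}
\end{align}

This problem is considered in the numerical tests in [2]. We use the same setup, that
is, we assume all the parameters $\alpha_i$ and $v_i$ are drawn from i.i.d. uniform distributions
on [0,1], with the $v_i$'s sorted in ascending order. In order to apply our dual method
and gradient projection method, we first perform a transformation as described in
Section 1.1. Note that the equality constraint in (36) can be replaced by an inequality
constraint since the objective of (36) is increasing in $y$. Then we add an artificial upper
bound $\beta = \sum_{i=1}^{n} \alpha_i$ for $y_i$ and define $z_i = \beta - y_i$. After these transformations, the
problem becomes

\begin{align*}
\text{minimize}_{z} \quad & \sum_{i=1}^{n} f_i(z_i) = \sum_{i=1}^{n} \left( \tfrac{1}{4} (\beta - z_i)^4 + v_i (\beta - z_i) \right) \\
\text{subject to} \quad & \sum_{i=1}^{k} z_i \le \sum_{i=1}^{k} (\beta - \alpha_i), \quad \forall k = 1, \ldots, n \\
& z_i \ge 0, \quad \forall i = 1, \ldots, n,
\end{align*}

which can be solved by both our dual method and the gradient projection method.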

From Table 1, we can see that both of our algorithms outperform the P-S algorithm
and CVX for this problem. The reason that the dual algorithm performs better
than the P-S algorithm is explained in Section 2. In particular, in this case, $g_i^{-1}(x) =
(x - v_i)^{1/3}$ and the equation (31) does not have a closed-form solution. In such cases,
the complexity of solving (31) is essentially similar to the complexity of solving (18) or
(21), and our dual algorithm is faster than the P-S algorithm by an order of $n$.

Note that in this problem

\begin{equation*}
\bar{z}_i = \arg\min_{0 \le z \le \beta} f_i(z) = \beta.
\end{equation*}

Therefore $L = n$, which means this is already the worst-case scenario for the dual method.
Yet it still performs quite well. For the same reason, the gradient projection
method works better than the dual method in this case. And both of them perform
significantly better than CVX.
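To see why $L = n$ here, it may help to write out Step 0 of Algorithm 1 for the transformed problem. The short derivation below assumes $\alpha_i > 0$ for all $i$ (which holds almost surely for uniform draws on $[0,1]$) and uses $\bar{z}_i = \beta$ together with the right-hand sides $\sum_{i=1}^{k} (\beta - \alpha_i)$.

```latex
\begin{align*}
d_k &= \sum_{i=1}^{k} \bar{z}_i - \sum_{i=1}^{k} (\beta - \alpha_i)
     = k\beta - k\beta + \sum_{i=1}^{k} \alpha_i
     = \sum_{i=1}^{k} \alpha_i > 0, \\
d_{k+1} - d_k &= \alpha_{k+1} > 0
  \quad \Longrightarrow \quad w_j = j \ \text{ for } j = 1, \ldots, n, \qquad L = n .
\end{align*}
```

Every index sets a new record for $d_k$, so no index is skipped in Step 0 and the outer loop runs the full $n$ times.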
The second problem is

\begin{align}
\text{(TP-2)} \quad \text{minimize}_{y} \quad & \sum_{i=1}^{n} \frac{v_i}{1 - y_i} \notag \\
\text{subject to} \quad & \sum_{i=1}^{k} y_i \ge \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n-1 \notag \\
& \sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \alpha_i \notag \\
& 0 \le y_i \le 1, \quad \forall i = 1, \ldots, n. \tag{37}
\end{align}

This problem is also considered in [2]. We again use the same setup, where $\alpha_i$ and
$v_i$ are drawn from i.i.d. uniform distributions on [0,1], with the $v_i$'s sorted in ascending
order. Similar to what we have done for problem 1, we replace the equality constraint
with an inequality and define $z_i = 1 - y_i$. An equivalent form of (37) is then obtained as
follows:

\begin{align*}
\text{minimize}_{z} \quad & \sum_{i=1}^{n} f_i(z_i) = \sum_{i=1}^{n} \frac{v_i}{z_i} \\
\text{subject to} \quad & \sum_{i=1}^{k} z_i \le k - \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n \\
& 0 \le z_i \le 1, \quad \forall i = 1, \ldots, n.
\end{align*}

In this case, $g_i^{-1}(x) = \sqrt{-v_i / x}$. As shown in [2], there is a closed-form solution to
(31) in this case. Thus the P-S algorithm can solve this problem very fast. This is
indeed observed in Table 1. Note that the performance of both the dual and gradient
projection methods also improves. This is partly because it is easier to evaluate the
function $g_i^{-1}$ in this case than in the first problem. Still, the gradient projection
method is faster than the dual method in this case, because we have $\bar{z}_i = 1$ and thus $L = n$
in the dual method. Again, both algorithms work much faster than CVX.
The last problem is the inventory control problem described in Section 1.2. The
optimization problem is:

\begin{align*}
\text{(TP-3)} \quad \text{minimize}_{y} \quad & \sum_{i=1}^{n} \left( u_i E(D_i - y_i)^+ + o_i E(y_i - D_i)^+ \right) \\
\text{subject to} \quad & \sum_{i=1}^{k} y_i \le \sum_{i=1}^{k} \alpha_i, \quad \forall k = 1, \ldots, n \\
& y_i \ge 0, \quad \forall i = 1, \ldots, n.
\end{align*}

In the numerical experiments, we assume that $o_i \sim U[5, 10]$, $u_i \sim U[20, 25]$ and
$\alpha_i \sim U[0, 20]$. We also assume that each $D_i$ follows an exponential distribution with
parameter $\eta_i \sim U[0.1, 0.2]$. By applying the property of the exponential distribution, the
objective can be equivalently written as:

\begin{equation*}
\sum_{i=1}^{n} f_i(y_i) = \sum_{i=1}^{n} \left( \frac{u_i + o_i}{\eta_i} e^{-\eta_i y_i} + o_i y_i - \frac{o_i}{\eta_i} \right)
\end{equation*}

with $\bar{y}_i = \frac{1}{\eta_i} \log\left( \frac{u_i + o_i}{o_i} \right) \in [5.49, 17.92]$, $g_i(y_i) = f_i'(y_i) = -(u_i + o_i) e^{-\eta_i y_i} + o_i$ and
$g_i^{-1}(s) = -\frac{1}{\eta_i} \log\left( \frac{o_i - s}{u_i + o_i} \right)$. Clearly (31) does not have a closed-form solution with such
$g_i^{-1}$; therefore our algorithms outperform the P-S algorithm. In fact, for this problem,
the dual method works very well. This is because the $L$'s in this case are usually
much smaller than $n$. In fact, $L$ is less than $n/4$ in most test problems. This
greatly reduces the computations in the dual method and makes it very efficient. The
gradient method cannot take advantage of this structure and thus only has similar
performance as in other problems.
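The closed-form quantities for exponential demand are easy to sanity-check numerically. The following minimal sketch assumes that the parameter $\eta_i$ is the rate of the exponential distribution (mean $1/\eta_i$), so that $E(D - y)^+ = e^{-\eta y}/\eta$ and $E(y - D)^+ = y - (1 - e^{-\eta y})/\eta$; the function names are hypothetical.

```python
import math

def f(y, u, o, eta):
    """Expected underage plus overage cost for Exp(eta) demand (eta taken as the rate)."""
    return (u + o) / eta * math.exp(-eta * y) + o * y - o / eta

def g(y, u, o, eta):
    """Derivative of f with respect to y."""
    return -(u + o) * math.exp(-eta * y) + o

def g_inv(s, u, o, eta):
    """Inverse of g, defined for s < o."""
    return -math.log((o - s) / (u + o)) / eta

def y_bar(u, o, eta):
    """Unconstrained minimizer of f, i.e. the root of g."""
    return math.log((u + o) / o) / eta

# Extremes of the parameter ranges used in the experiments.
y_min = y_bar(20.0, 10.0, 0.2)
y_max = y_bar(25.0, 5.0, 0.1)
```

The two extreme values reproduce the interval for $\bar{y}_i$ quoted in the text, which supports reading $\eta_i$ as a rate parameter.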

To summarize the numerical results, we observe that our algorithms exhibit significant
performance improvement over the P-S algorithm when equation (31) does not
have a closed-form solution. They also greatly improve over the performance of
CVX. Between the dual method and the gradient projection method, the former
is more efficient when $L$ is relatively small; otherwise, the latter is usually more
efficient.

5 Conclusions 

In this paper, we propose two algorithms for solving a class of convex optimization
problems with linear ascending inequality constraints. When the objective is separable,
we propose a dual method which improves the worst-case complexity of the algorithm
proposed in [2]. Furthermore, we propose a gradient projection algorithm in which
each projection step uses the dual method as a subroutine. The gradient projection
algorithm can be used to solve more general non-separable problems and does not need
to evaluate the inverse of the gradient function, which dual methods usually require.
Numerical results show that both of our proposed algorithms work well in test problems.

6 Acknowledgement 

I thank Diwakar Gupta and Shiqian Ma for useful discussions and Arun Padakandla 
and Rajesh Sundaresan for sharing the code of the P-S algorithm with me. 






A Proof of Lemma 2



Proof of Lemma [2l We prove by contradiction. If A^^, = 0, then 
1. If there exists r' = max{0 < r < r* : A^,, > 0}, then we know that 

a-as)<0. 
Therefore, since r' < r*, we know that 

s=wj_^+l \ V t=r'+l / / 

On the other hand, we have 



E 
E 



Et=r*+1 '^wSj 


- as 


Ef=r'+1 ^^'t)) 


- as 


Ei=r*+1 ^wt) ^ 


- 


Et=r'+1 ^^'t)) 


- as 


- a^) < 0. 





(38) 

Here the first equality is because of the assumption that A^^ = for all r' < r < 
r*, and the second equality is because of the induction assumption. However, 
(1381) contradicts with the definition of r*. 



2. If all A^^ = for r < r*. Then we have 



where in the last inequality, the first term is less than 0 since the algorithm enters
Case 2 in this loop, and the second term is less than or equal to 0 due to the
induction assumption. Therefore we have proved Lemma 2. □